LogMerge is a utility to merge log files by date. It reads from any of several
formats and writes to any of several formats. It can also filter out entries
based on host name, file requested, agent, authenticated user, or server name.
This is useful for getting logs from a server cluster into a form where Summary
can correctly count the visits.

You run LogMerge from the DOS command line. Invoke it with a list of log file
names. The merged result will be written to a file called 'merged.gz' if
compression is on, or 'merged.log' if it is off (see the 'compress' setting
below).
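
The idea of merging by date can be sketched in Python as follows. This is only
an illustration of the technique, not LogMerge's own code. The function name
merge_by_date and the (timestamp, line) tuples are assumptions made for the
example, and each input log is assumed to be in date order already.

```python
import heapq
from datetime import datetime

def merge_by_date(*logs):
    """Merge several date-ordered logs into one date-ordered list.

    Each log is a list of (timestamp, line) tuples that is already
    sorted by timestamp. This layout is an assumption for the example.
    """
    # heapq.merge interleaves already-sorted inputs without re-sorting
    # them; the key makes it compare only the timestamps.
    return list(heapq.merge(*logs, key=lambda entry: entry[0]))

# Hypothetical entries from two servers in a cluster.
server_a = [
    (datetime(2001, 3, 1, 10, 0, 0), "a: GET /index.html"),
    (datetime(2001, 3, 1, 10, 0, 5), "a: GET /logo.gif"),
]
server_b = [
    (datetime(2001, 3, 1, 10, 0, 2), "b: GET /index.html"),
]

merged = merge_by_date(server_a, server_b)
for stamp, line in merged:
    print(stamp.isoformat(), line)
```

The entry from server_b lands between the two entries from server_a, which is
the ordering a visit counter needs.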

LogMerge will read logs in any mixture of WebSTAR Extended format, WebSTAR
ExLF format, Microsoft IIS format, Microsoft W3C ExLF format, and any of
several other formats. LogMerge can write NCSA Combined, WebSTAR Extended
format, or Microsoft W3C ExLF format. If possible you should use the same
format for output as the logs you are merging. Input logs can be plain text
files or compressed with Zip, GZip, or BZip2.

LogMerge has a configuration file named "logmerge.cfg" which must be in the
same folder as the LogMerge application. There are several configuration
commands that control the translation process. Most users will be able to use
the default settings.

Lines in the configuration file starting with a '#' are comments.

'compress' - yes or no. Controls compression of the translated log file. If set
to yes the log file will be written with GZip compression and an extension of
'.gz'. Otherwise it will be written as a text file with an extension of '.log'.
The default is yes.

'level' - a single digit from 0 to 9. Controls the level of compression used on
translated log files when compression is on. This number is passed to the ZLib
code. Higher numbers result in better compression and longer run times. The
default is 9.
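
To get a feel for what the level means, the following Python sketch compresses
the same data at several levels with Python's zlib module. The sample log line
is invented for the example, and the sketch says nothing about how LogMerge
itself calls ZLib. Level 0 stores the data without compressing it.

```python
import zlib

# Hypothetical sample data: one log line repeated many times.
line = b'192.0.2.1 - - [01/Mar/2001:10:00:00 -0500] "GET / HTTP/1.0" 200 512\n'
data = line * 1000

# Compress the same data at several zlib levels and record the sizes.
sizes = {level: len(zlib.compress(data, level)) for level in (0, 1, 9)}
for level, size in sizes.items():
    print(f"level {level}: {size} bytes")
```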

'vhost' - yes or no. Controls the contents of the second field of log entries
when using NCSA Combined format for output. NCSA Combined logs officially use
this field for the user e-mail address, but this usage is very rare for security
reasons. WebTrends can read a virtual server name from this field. Set vhost to
yes to put the virtual server name in this field instead of leaving it blank.
The default is yes.

'verbose' - yes or no. Controls verbose progress messages. Yes means print
progress messages. The default is yes on the Mac and no on other platforms.

'format' - user log format string. Allows you to set up a user log format string
for parsing the input log file. The format of the string is the same as user log
formats in Summary. See <http://summary.net/manual/log_formats.html> for a
complete discussion of supported log formats and custom log format strings.
FileMaker Pro brief format needs a user format command of
"format DATE-MDY TIME-12 HOST URI".

'filter-host'
'filter-request'
'filter-agent'
'filter-server'
'filter-user' - set pattern match string. These directives can appear more than
once with different pattern strings. If the corresponding field matches the
pattern that log entry is discarded. Patterns can contain '*' which matches any
sequence of characters. A '?' will match any single character. Use a leading
backslash '\' to test for a '*' or '?'. For example '\*' will match a '*'
character instead of any sequence of characters. If a pattern starts with a '+'
any entry which matches the pattern is kept. Patterns are tested in the order
they appear in the configuration file and the first one to match an entry is
used. Thus if you say 'filter-host +ahost' and then say 'filter-host *', any
entry with a host name of 'ahost' is kept and all others are discarded.
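
The matching rules above can be sketched in Python as follows. This is an
illustration, not LogMerge's actual code. The function names are made up for
the example, and three behaviors are assumptions the text does not settle: a
pattern must match the whole field, matching is case sensitive, and an entry
that matches no pattern is kept.

```python
import re

def pattern_to_regex(pattern):
    """Translate a filter pattern into a compiled regular expression."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # Leading backslash: take the next character literally.
            parts.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if ch == "*":
            parts.append(".*")          # any sequence of characters
        elif ch == "?":
            parts.append(".")           # any single character
        else:
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts), re.DOTALL)

def keep_entry(value, patterns):
    """Return True if an entry whose field is `value` should be kept."""
    for pattern in patterns:
        keep = pattern.startswith("+")
        body = pattern[1:] if keep else pattern
        # Assumption: the pattern has to match the entire field.
        if pattern_to_regex(body).fullmatch(value):
            return keep                 # the first matching pattern decides
    return True                         # assumption: unmatched entries are kept

rules = ["+ahost", "*"]                 # the filter-host example above
print(keep_entry("ahost", rules))       # True
print(keep_entry("otherhost", rules))   # False
```

Reversing the two rules would discard everything, because '*' would match first.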

'trim-first-dir' - a single digit from 0 to 9. When non-zero this option causes
the first N directory names to be removed from all requests. This is handy when
filtering down to a single virtual domain: the domain is often indicated by the
first directory name(s), which are redundant after the filtering. Removing them
provides a cleaner report and URLs more like what a user types in most
situations. The default is 0.
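
As a rough Python sketch of the effect, under the assumption that only
directory names are removed and the final path component is always kept
(the function name trim_first_dirs and the sample paths are made up for the
example, and query strings are not considered):

```python
def trim_first_dirs(path, n):
    """Remove the first n directory names from a request path."""
    if n <= 0:
        return path
    # "/site1/docs/index.html" -> ["site1", "docs", "index.html"]
    parts = path.lstrip("/").split("/")
    # Assumption: never remove the last component, only directories.
    drop = min(n, len(parts) - 1)
    return "/" + "/".join(parts[drop:])

print(trim_first_dirs("/site1/docs/index.html", 1))  # /docs/index.html
print(trim_first_dirs("/site1/docs/index.html", 2))  # /index.html
```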

'newlines' - mac, unix, or dos. This controls the newline format in the output.
Mac newlines are a single CR. Unix newlines are a single LF. DOS/Windows
newlines are a CR LF sequence. You can output DOS newlines directly, saving a
separate conversion step when transferring to a PC. The default is to use the
newline format of the machine LogMerge is being run on.
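
The three newline formats can be illustrated with a short Python sketch. The
function name convert_newlines and the NEWLINES table are made up for the
example; this is not how LogMerge itself is written.

```python
# The three newline styles described above.
NEWLINES = {"mac": "\r", "unix": "\n", "dos": "\r\n"}

def convert_newlines(text, style):
    """Rewrite every line ending in text to the given style."""
    # First reduce all three styles to a bare LF. CR LF has to be
    # handled before a lone CR so it is not counted as two newlines.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    return normalized.replace("\n", NEWLINES[style])

mixed = "first\r\nsecond\rthird\n"
print(repr(convert_newlines(mixed, "dos")))
```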

'output-format' - webstar, ncsa, or w3c. This controls the output log format.
'webstar' is WebSTAR Extended format. 'ncsa' is NCSA Combined format. 'w3c' is
Microsoft W3C ExLF format. If possible, it is best to use the same output format
as the format of the logs you are merging.

'lowercase' - yes or no. When set to yes, this will lowercase all request
strings. This is useful when the server ignores the case of the request.

'normalize' - yes or no. Normalizing puts all of the fields into a standard
format. It lowercases domain names, removes proxy server information tags from
agent strings, and several other transformations. This is the same process
Summary uses before analyzing a log entry.



LogMerge is shareware.  It is copyright 2000-2001 by Jason T. Linhart.  I give
you permission to try it out for thirty days. After thirty days you must either
register the program or stop using it. Beta versions may be used freely until
they expire.  You may give out copies, so long as you include all of the files
in the original package without modification. People receiving the copies are
subject to these terms. It may not be sold or commercially distributed without a
written licence from me.  It may be included in archives, and distributed on
CD-ROM or on other formats so long as there are no charges for these services
other than shipping, handling and the cost of media.  Use or distribution of
LogMerge indicates your agreement to these terms.

GZip file IO is provided by the ZLib library from Jean-loup Gailly
<jloup@gzip.org> and Mark Adler <madler@alumni.caltech.edu>. The zlib home page
is: <http://www.gzip.org/zlib/>. Zip file parsing is provided by Gilles Vollant,
and is available as part of the ZLib package.

BZip2 file IO is provided by the libbzip2 library from Julian Seward,
<jseward@acm.org>. The BZip2 home page is: <http://sourceware.cygnus.com/bzip2/>.

For the latest information and updates check:
<http://summary.net/soft/logmerge.html>.

Program by:
Jason T. Linhart
<http://summary.net/>
jason@summary.net
